Seismic data denoising through multiscale and sparsity-promoting dictionary learning
Abstract
Seismic data comprise many traces that provide a spatiotemporal sampling of the reflected wavefield. However, such information may suffer from ambient and random noise during acquisition, which can limit the use of seismic data in locating reservoirs. Traditionally, fixed transforms are used to separate the noise from the data by exploiting their different characteristics in a transform domain. However, their performance may not be satisfactory due to their lack of adaptability to changing data structures. We have developed a novel seismic data denoising method based on a parametric dictionary learning scheme. Unlike previous dictionary learning methods that had to learn unconstrained atoms, our method exploits the underlying sparse structure of the learned atoms over a base dictionary and significantly reduces the number of dictionary elements that need to be learned. By combining the advantages of multiscale representations with the power of dictionary learning, more degrees of freedom could be provided to the sparse representation, and therefore the characteristics of seismic data could be efficiently captured in sparse coefficients for denoising. The dictionary learning and denoising were performed on all overlapping patches of the given noisy seismic data, which maintained low complexity. Numerical experiments on synthetic seismic data indicated that our scheme achieved the best denoising performance in terms of peak signal-to-noise ratio and minimized visual distortion.

INTRODUCTION

Seismic data quality is vital to geophysical applications. The ideal noiseless data acquisition environment may only exist in a synthetic model. However, real seismic data suffer from different sources of noise in marine and land acquisition. In a seismic survey, which is a primary tool of exploration geophysics, noise caused by ground roll, drill rigs, and seismic vessels, to name a few sources, can degrade the subsurface imaging quality of migration.
To maximize the contribution of seismic imaging in hydrocarbon reservoir locating, characterizing, and recovering, one of the key components of seismic data processing is noise suppression or removal. A more challenging scenario is microseismic monitoring, where very low amplitude data are typically acquired with strong coherent and random noise. The low signal-to-noise ratio (S/N) here invalidates some traditional detection and location methods, such as algorithms based on arrival time picking. Traditionally, seismic data noise is segregated from the recorded data in various domains, in which the signal and noise have different characteristics (Chen and Ma, 2014). Typical time-domain methods include stacking (Liu et al., 2009), polynomial fitting (Liu et al., 2011), and prediction (Abma and Claerbout, 1995). Frequency-domain methods, such as prediction (Canales, 1984) and filtering (Gulunay, 1986), exploit the predictability of the seismic signal in the spatiotemporal domain. Although the aforementioned methods are capable of attenuating the noise to a certain degree, it is also desirable for the denoising outputs to have minimal signal distortion, especially in low S/N situations (Harris and White, 1997).

In the past decade, the study of sparse signal representation has evolved rapidly to provide methods that capture the useful characteristics of the signal in various domains, as well as providing a new perspective for denoising. The stationary wavelet transform was used in Chanerley and Alexander (2002) as an alternative to band-pass filtering, and it removed corrupting signals for a better estimate of true ground motion. A new wavelet frame based on the characteristics of seismic data was proposed by Zhang and Ulrych (2003) to suppress noise.
The 2D isotropic wavelets can represent point-like features with sparse coefficients, although the lack of directional selectivity limits their ability to sparsely represent edges and curved features in 2D signals, which are evidently contained in 2D seismic data.

Manuscript received by the Editor 31 January 2015; revised manuscript received 9 May 2015; published online 28 August 2015; corrected version published online 19 October 2015. Georgia Institute of Technology, Center for Energy and Geo Processing and KFUPM, Atlanta, Georgia, USA. E-mail: [email protected]; entao.liu@ece.gatech.edu; [email protected]. © 2015 Society of Exploration Geophysicists. All rights reserved. GEOPHYSICS, VOL. 80, NO. 6 (NOVEMBER-DECEMBER 2015); P. WD45–WD57, 12 FIGS., 2 TABLES. 10.1190/GEO2015-0047.1

Toward this end, the hyperbolic Radon transform has been adopted to represent seismic reflections with sparse coefficients (Ji, 2006), and it has been used for deblending seismic data in blended source acquisition (Ibrahim and Sacchi, 2014a, 2014b). Furthermore, a family of directional multiscale transforms, such as the curvelet (Candès and Donoho, 2004; Candès and Demanet, 2005; Candès et al., 2006) and contourlet (Do and Vetterli, 2003, 2005) transforms, has been introduced to exploit directional details and geometric smoothness along 2D curves and 3D surfaces. Their basis functions are localized in different scales, positions, and orientations.
Because seismic data are known to have smooth, anisotropic contours, particularly in reflected wavefronts, these multiscale geometric analysis methods use substantially fewer coefficients than wavelets to represent seismic data at a given accuracy, and they have attracted a great deal of attention for seismic data denoising (Hennenfent and Herrmann, 2006; Neelamani et al., 2008; Hennenfent et al., 2010).

The analytic transforms mentioned above are model-driven: each is based on a formulated mathematical model of the data, leading to an implicit dictionary described by a structured algorithm. Alternatively, a data-driven process learns its dictionary as an unstructured and explicit matrix from a training set, in such a way that each data signal can be represented as a linear combination of only a few columns (atoms) of the matrix. Typical dictionary learning algorithms range from principal component analysis (PCA) (Jolliffe, 2002) and generalized PCA (Vidal et al., 2005) to the method of optimal directions (Engan et al., 1999) and the K-singular value decomposition (K-SVD) (Aharon et al., 2006). Dictionary learning methods avoid choosing a fixed dictionary in which some atoms might be of limited use, and therefore offer refined dictionaries that adapt to the structure of the data. This approach yields better performance in many image-based applications, such as image denoising (Elad and Aharon, 2006; Mairal et al., 2008) and superresolution (Yang et al., 2008). The dictionary learning method has recently been successfully applied to seismic data denoising (Tang et al., 2012; Beckouche and Ma, 2014), as well as to seismic data deblending (Zhou et al., 2014). However, these applications come at the price of high overhead, including explicit storage and multiplication of unstructured dictionary matrices.

In this paper, we propose a novel denoising scheme for seismic data based on its sparse representation over a learned dictionary.
The dictionary is trained by a variant of the K-SVD algorithm, named sparse K-SVD (Rubinstein et al., 2010), over a set of seismic data patches. The motivation of sparse K-SVD is that the learned dictionary atoms from K-SVD may still share some underlying sparse pattern over a generic dictionary. Therefore, Rubinstein et al. (2010) suggest constructing the effective learned dictionary D = ΦA as the product of a base dictionary Φ, corresponding to a fixed analytic transform (e.g., the discrete cosine transform [DCT] in Rubinstein et al., 2010), and a sparse matrix A, which is the part that is actually learned. By relieving the need to learn all elements in D, such a parametric model of the learned dictionary strikes a good balance among complexity, adaptivity, and performance. The algorithm we propose is summarized as follows: We start from a base dictionary Φ and learn a sparse matrix A from the noisy seismic data patch set. For each overlapping patch of seismic data, a sparse coding problem with respect to the effective learned dictionary D = ΦA is solved to perform denoising. Then, all patch results are reconstructed, tiled, and averaged to assemble the denoised seismic data. We also compare our scheme with those using sparse approximations for seismic denoising based on curvelets and contourlets. Experimental results indicate that our scheme achieves the best performance in terms of peak signal-to-noise ratio (PSNR) and produces the least visual distortion.

The rest of this paper is organized as follows: In the second section, we introduce the motivation of the dictionary model with double sparsity and the details of the sparse K-SVD algorithm. The following section describes patch-based seismic data denoising using the learned dictionary. Learning separate multiscale dictionaries and performing denoising in different subbands of a multiscale transform are also presented in this section.
Numerical experiments are given in the next section, followed by a discussion and the conclusion in the final two sections.

SIGNAL REPRESENTATION WITH DICTIONARY LEARNING

A dictionary model with double sparsity

Given a training set $Y = [y_1, y_2, \ldots, y_R] \in \mathbb{R}^{N \times R}$, in which each element is a column vector of length $N$, the goal of the dictionary learning process is to find a matrix $D \in \mathbb{R}^{N \times L}$ that is able to represent the training set $Y$ with a set of sparse coefficients $X = [x_1, x_2, \ldots, x_R] \in \mathbb{R}^{L \times R}$ by solving the following optimization problem:

$$\{\hat{D}, \hat{X}\} = \arg\min_{D, X} \|Y - DX\|_F \quad \text{subject to} \quad \begin{cases} \forall i: \|x_i\|_0 \le t, \\ \forall j: \|d_j\|_2 = 1, \end{cases} \tag{1}$$

where each column vector $x_i$ of length $L$ is the sparse representation of $y_i$ over $D$, $d_j$ is the $j$th atom (column) of $D$, and $\|\cdot\|_0$ is the $\ell_0$-norm, which counts the nonzero entries of a vector. The atom normalization constraint $\|d_j\|_2 = 1$ is added to increase the robustness of the dictionary; however, it does not essentially change the problem. Though the $\ell_0$-norm optimization problem 1 is generally NP-hard and cannot be tackled directly, it can be relaxed into the following tractable problem by replacing the $\ell_0$-norm with the $\ell_1$-norm:

$$\{\hat{D}, \hat{X}\} = \arg\min_{D, X} \|Y - DX\|_F \quad \text{subject to} \quad \begin{cases} \forall i: \|x_i\|_1 \le t, \\ \forall j: \|d_j\|_2 = 1. \end{cases} \tag{2}$$

Such a convex relaxation yields an exact solution to the $\ell_0$-norm optimization problem 1 under certain conditions specified in Donoho (2006), and thereafter, we use the $\ell_1$-norm to measure the sparsity level. For those applications using natural images, the dimensions $N$ of the training signals $\{y_i\}$ and $L$ of the sparse coefficients $\{x_i\}$ are large. Any attempt to learn a full-size dictionary requires a huge number $R$ of training signals as well, yielding an intractable computational complexity of solving the $\ell_1$-norm optimization problem 2. This difficulty also applies to seismic applications, where the recorded data from a single shot may include hundreds of traces and thousands of samples per trace.
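To make the constraints in problem 1 concrete, the following is a minimal sketch of the sparse coding half of the problem: with the dictionary D held fixed, each signal is approximated using at most t atoms. It uses orthogonal matching pursuit, a common greedy heuristic that this section does not itself prescribe, so treat the solver choice as an assumption. The dictionary is built as D = ΦA with an orthonormal DCT base and a randomly generated sparse A; the sizes `N`, `L`, `p`, and `t`, the random A, and the helper names `dct_basis` and `omp` are all illustrative and are not taken from the paper.

```python
import numpy as np


def dct_basis(N):
    """Orthonormal DCT-II basis; column k is the k-th cosine atom."""
    n = np.arange(N)[:, None]
    k = np.arange(N)[None, :]
    Phi = np.cos(np.pi * (2 * n + 1) * k / (2 * N))
    Phi[:, 0] *= np.sqrt(1.0 / N)
    Phi[:, 1:] *= np.sqrt(2.0 / N)
    return Phi


def omp(D, y, t):
    """Greedy approximation to min ||y - D x||_2 subject to ||x||_0 <= t.

    D : (N, L) dictionary with unit-norm columns.
    y : (N,) signal.
    t : maximum number of nonzero coefficients.
    """
    x = np.zeros(D.shape[1])
    residual = y.astype(float).copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(t):
        # Pick the atom most correlated with the current residual.
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = 0.0  # never reselect an atom
        k = int(np.argmax(correlations))
        if correlations[k] < 1e-12:
            break  # nothing left to explain
        support.append(k)
        # Least-squares refit of y on all atoms selected so far.
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coeffs
    if support:
        x[support] = coeffs
    return x


# Illustrative sizes only: 64-sample signals, 100 atoms,
# p nonzeros per column of A, t nonzeros per sparse code.
rng = np.random.default_rng(0)
N, L, p, t = 64, 100, 4, 3

Phi = dct_basis(N)
A = np.zeros((N, L))
for j in range(L):
    rows = rng.choice(N, size=p, replace=False)
    A[rows, j] = rng.standard_normal(p)
# Rescale the columns of A so every atom of D = Phi @ A has unit l2 norm.
A /= np.linalg.norm(Phi @ A, axis=0)
D = Phi @ A

# A synthetic signal: t atoms of D plus a little random noise.
x_true = np.zeros(L)
x_true[rng.choice(L, size=t, replace=False)] = rng.standard_normal(t)
y = D @ x_true + 0.01 * rng.standard_normal(N)

x_hat = omp(D, y, t)
print("nonzeros in x_hat:", np.count_nonzero(x_hat))
print("relative residual:", np.linalg.norm(y - D @ x_hat) / np.linalg.norm(y))
```

Only the p nonzeros in each column of A would need to be learned and stored, rather than all N entries of each atom, which is the saving that the double-sparsity model is after. The sketch does not attempt the dictionary update step.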
To learn the dictionary D with moderate computational effort, patch-based processing (with patch dimensions on the order of 10) is commonly used. This means that the training set must be composed of small patches from the input data, and the patches should overlap to avoid blocking artifacts.

So, what could be an appropriate dictionary to represent seismic data? Before attempting to answer this question, we must first understand seismic data. A seismic data set is a collection of data traces, each of which is a continuous wave recorded by a seismic recorder, such as a geophone, from a seismic source, either a man-made seismic event or a naturally occurring earthquake. Many traces together provide a spatiotemporal sampling of the reflected wavefield, which contains a straight line and hyperbolas that correspond to the direct ray and to reflections with normal moveout, respectively. Figure 1b shows an example of a dictionary of 100 atoms learned by the K-SVD algorithm (Aharon et al., 2006) on a set of 16 × 16 patches from the synthetic seismic data set shown in Figure 1a. Although the algorithm imposes no constraints, we can notice the strong resemblance among atoms in the resulting dictionary, which suggests that the sparse representation can be extended to each atom itself over some predefined base dictionary. Therefore, we can express the dictionary as
Related resources
Sparse-promoting Full Waveform Inversion based on Online Orthonormal Dictionary Learning
Full waveform inversion (FWI) delivers high-resolution images of the subsurface by iteratively minimizing the misfit between the recorded and calculated seismic data. It has been attacked successfully with the Gauss-Newton method and sparsity-promoting regularization based on fixed multiscale transforms that permit significant subsampling of the seismic data when the model perturbation at each F...
Learning a collaborative multiscale dictionary based on robust empirical mode decomposition
Abstract. Dictionary learning is a challenging topic in many image processing areas. The basic goal is to learn a sparse representation from an overcomplete basis set. By combining the advantages of generic multiscale representations with learning-based adaptivity, multiscale dictionary representation approaches are powerful in capturing structural characteristics of natural images. However...
Speech Enhancement using Adaptive Data-Based Dictionary Learning
In this paper, a speech enhancement method based on sparse representation of data frames has been presented. Speech enhancement is one of the most applicable areas in different signal processing fields. The objective of a speech enhancement system is improvement of either intelligibility or quality of the speech signals. This process is carried out using the speech signal processing techniques ...
A Novel Image Denoising Method Based on Incoherent Dictionary Learning and Domain Adaptation Technique
In this paper, a new method for image denoising based on incoherent dictionary learning and domain transfer technique is proposed. The idea of using sparse representation concept is one of the most interesting areas for researchers. The goal of sparse coding is to approximately model the input data as a weighted linear combination of a small number of basis vectors. Two characteristics should b...
Graph regularized seismic dictionary learning
A graph-based regularization for geophysical inversion is proposed that offers a more efficient way to solve inverse denoising problems by dictionary learning methods designed to find a sparse signal representation that adaptively captures prominent characteristics in a given data. Most traditional dictionary learning methods convert 2D seismic data patches or 3D data volumes into 1D vectors fo...